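Gravitational wave astronomy is a vibrant field that uses both classic and modern data processing techniques to understand the universe. Various approaches have been proposed to improve the efficiency of the detection scheme, with hierarchical matched filtering being an important strategy. Meanwhile, deep learning methods have recently demonstrated both consistency with matched filtering methods and remarkable statistical performance. In this work, we propose the Hierarchical Detection Network (HDN), a novel approach to efficient detection that combines ideas from hierarchical matching and deep learning. The network is trained using a novel loss function that simultaneously encodes the goals of statistical accuracy and efficiency. We discuss the source of the complexity reduction in the proposed model and describe a general recipe for initialization in which each layer specializes in a different region. We demonstrate the performance of HDN with experiments using open LIGO data and synthetic injections, and with two-layer models we observe a $79\%$ efficiency gain compared with matched filtering at an equal error rate of $0.2\%$. Furthermore, we show how training a three-layer HDN initialized from a two-layer model can further boost both accuracy and efficiency, highlighting the power of multiple simple layers in efficient detection.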
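The idea of hierarchical matching mentioned above can be illustrated with a toy two-stage search. This is only a sketch of classical two-stage filtering, not the HDN itself: the function names, the subsampling factor, both thresholds, and the chirp-like template are assumptions chosen for illustration. A cheap score on a subsampled window rejects most positions, and the full matched filter runs only on the survivors.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length sequences (0.0 if either is all zeros)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0

def hierarchical_search(data, template, step=4, coarse_thresh=0.5, fine_thresh=0.9):
    """Two-stage search; returns (detections, number_of_fine_evaluations)."""
    coarse_template = template[::step]      # cheap stage uses every `step`-th sample
    detections, fine_evals = [], 0
    for start in range(len(data) - len(template) + 1):
        window = data[start:start + len(template)]
        # Stage 1: inexpensive score on the subsampled window.
        if cosine(window[::step], coarse_template) < coarse_thresh:
            continue                        # rejected early; full filter is skipped
        # Stage 2: full-resolution matched filter, only for surviving candidates.
        fine_evals += 1
        score = cosine(window, template)
        if score >= fine_thresh:
            detections.append((start, score))
    return detections, fine_evals

# Hypothetical toy data: a noiseless chirp-like template injected at offset 40.
template = [math.sin(0.05 * k * k) for k in range(32)]
data = [0.0] * 100
data[40:72] = template
detections, fine_evals = hierarchical_search(data, template)
```

The second return value counts how often the expensive stage ran, which is the quantity a hierarchical scheme tries to keep small relative to the total number of window positions.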
translated by Google Translate
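Automatic segmentation of the placenta in fetal ultrasound (US) is challenging due to (i) the high diversity of placenta appearance, (ii) the restricted quality of US, which results in highly variable reference annotations, and (iii) the limited field of view of US, which prohibits whole-placenta assessment at late gestation. In this work, we address these three challenges with a multi-task learning approach that combines the classification of placental location (e.g., anterior, posterior) and semantic placenta segmentation in a single convolutional neural network. Through the classification task, the model can learn from larger and more diverse datasets while improving the accuracy of the segmentation task, in particular under limited training set conditions. With this approach, we investigate the variability in annotations from multiple raters and show that our automatic segmentations (Dice of 0.86 for anterior and 0.83 for posterior placentas) achieve human-level performance compared with intra- and inter-observer variability. Lastly, our approach can deliver whole-placenta segmentation using a multi-view US acquisition pipeline consisting of three stages: multi-probe image acquisition, image fusion, and image segmentation. This results in high-quality segmentation, with reduced image artifacts, of larger structures such as the placenta in US that extend beyond the field of view of a single probe.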
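For reference, the Dice score quoted above is the standard overlap measure 2|A∩B| / (|A| + |B|) between a predicted and a reference mask. The following minimal sketch computes it on flattened binary masks; the function name and the convention of returning 1.0 for two empty masks are assumptions, and the abstract does not say how the authors handle that edge case.

```python
def dice(pred, ref):
    """Dice coefficient between two flattened binary masks of equal length."""
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Assumed convention: two empty masks count as a perfect match.
    return 2.0 * intersection / total if total else 1.0

# Toy masks: one overlapping pixel, two foreground pixels in each mask.
score = dice([1, 1, 0, 0], [1, 0, 1, 0])
```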
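Deep generative models have emerged as promising tools for detecting arbitrary anomalies in data, dispensing with the need for manual labelling. Recently, autoregressive transformers have achieved state-of-the-art performance in medical imaging. However, these models still have some intrinsic weaknesses, such as requiring images to be modelled as 1D sequences, the accumulation of errors during the sampling process, and the significant inference times associated with transformers. Denoising diffusion probabilistic models are a class of non-autoregressive generative models that have recently been shown to produce excellent samples in computer vision (surpassing generative adversarial networks) and to achieve log-likelihoods that are competitive with transformers while having fast inference times. Diffusion models can be applied to the latent representations learnt by autoencoders, which makes them easily scalable and excellent candidates for high-dimensional data such as medical images. Here, we propose a method based on diffusion models to detect and segment anomalies in brain imaging. By training the models on healthy data and then exploring their diffusion and reverse steps across the Markov chain, we can identify anomalous areas in the latent space and hence identify anomalies in the pixel space. Our diffusion models achieve competitive performance across a series of experiments with 2D CT and MRI data involving synthetic and real pathological lesions, with much reduced inference times, making their use clinically viable.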
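The diffusion (forward) step that the method explores can be sketched with the standard closed form for denoising diffusion probabilistic models, x_t = sqrt(ᾱ_t) x_0 + sqrt(1 − ᾱ_t) ε with ε drawn from a standard normal. The linear noise schedule, its endpoints, the number of steps, and the function names below are assumptions for illustration; the abstract does not state the authors' settings, and their model operates on autoencoder latents rather than on a plain vector.

```python
import math
import random

def alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    bars, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
        bars.append(prod)
    return bars

def q_sample(x0, t, bars, rng=random):
    """Draw x_t given x_0 in one shot using the closed-form forward process."""
    a = bars[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for x in x0]

bars = alpha_bars()
# Hypothetical two-element "latent"; a seeded generator keeps the draw reproducible.
noisy = q_sample([0.5, -0.5], t=10, bars=bars, rng=random.Random(0))
```

Small t keeps x_t close to the input, while large t drives it toward pure noise, so the choice of how far to diffuse before reversing controls how much of the original content can be altered.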
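As engineered systems grow in complexity, there is an increasing need for automatic methods that can detect, diagnose, and even correct transient anomalies that inevitably arise and that can be difficult or impossible to diagnose and fix manually. Among the most sensitive and complex systems of our civilization are the detectors that search for incredibly small variations in distance caused by gravitational waves, phenomena originally predicted by Albert Einstein to emerge and propagate through the universe as the result of collisions between black holes and other massive objects in deep space. The extreme complexity and precision of such detectors make them subject to transient noise issues that can significantly limit their sensitivity and effectiveness. In this work, we present a demonstration of a method that can detect and characterize emergent transient anomalies in such massively complex systems. We illustrate the performance, precision, and adaptability of the automated solution on one of the prevalent issues limiting gravitational-wave discoveries: noise artifacts of terrestrial origin that contaminate the highly sensitive measurements of gravitational-wave observatories and can obscure or even mimic the astrophysical signals they are listening for. Specifically, we demonstrate how a highly interpretable convolutional classifier can automatically learn to detect transient anomalies from auxiliary detector data without needing to observe the anomalies themselves. We also illustrate several other useful features of the model, including how it performs automatic variable selection to reduce tens of thousands of auxiliary data channels to only a few relevant ones; how it identifies behavioral signatures of anomalies in those channels; and how it can be used to investigate individual anomalies and the channels associated with them.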
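As our ability to sense increases, we are experiencing a transition from data-poor problems, in which the central issue is a lack of relevant data, to data-rich problems, in which the central issue is to identify a few relevant features in a sea of observations. Motivated by applications in gravitational-wave astrophysics, we study the problem of predicting the presence of transient noise artifacts in a gravitational-wave detector from a rich collection of measurements from the detector and its environment. We argue that feature learning, in which relevant features are optimized from data, is critical to achieving high accuracy. We introduce models that reduce the error rate by over 60% compared with the previous state of the art, which used fixed, hand-crafted features. Feature learning is useful not only because it improves performance on prediction tasks; the results also provide valuable information about patterns associated with phenomena of interest that would otherwise be undiscoverable. In our application, features found to be associated with transient noise provide diagnostic information about its origin and suggest mitigation strategies. Learning in high-dimensional settings is challenging. Through experiments with a variety of architectures, we identify two key factors in successful models: sparsity, for selecting relevant variables within the high-dimensional observations; and depth, which confers flexibility for handling complex interactions and robustness with respect to temporal variations. We illustrate their significance through systematic experiments on real detector data. Our results provide experimental corroboration of commonly held assumptions in the machine-learning community and have direct applicability to improving our ability to sense gravitational waves, as well as to many other problem settings with similarly high-dimensional, noisy, or partly irrelevant data.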
We present a dynamic path planning algorithm to navigate an amphibious rotor craft through a concave time-invariant obstacle field while attempting to minimize energy usage. We create a nonlinear quaternion state model that represents the rotor craft dynamics above and below the water. The 6-degree-of-freedom dynamics are used within a layered architecture to generate motion paths for the vehicle to follow and the required control inputs. The rotor craft has a three-dimensional map of its surroundings that is updated via limited-range onboard sensor readings within the current medium (air or water). Path planning is done via PRM and D* Lite.
Nine language-vision AI models trained on web scrapes with the Contrastive Language-Image Pretraining (CLIP) objective are evaluated for evidence of a bias studied by psychologists: the sexual objectification of girls and women, which occurs when a person's human characteristics are disregarded and the person is treated as a body or a collection of body parts. A first experiment uses standardized images of women from the Sexual OBjectification and EMotion Database, and finds that, commensurate with prior research in psychology, human characteristics are disassociated from images of objectified women: the model's recognition of emotional state is mediated by whether the subject is fully or partially clothed. Embedding association tests (EATs) return significant effect sizes for both anger (d > .8) and sadness (d > .5). A second experiment measures the effect in a representative application: an automatic image captioner (Antarctic Captions) includes words denoting emotion less than 50% as often for images of partially clothed women as for images of fully clothed women. A third experiment finds that images of female professionals (scientists, doctors, executives) are likely to be associated with sexual descriptions relative to images of male professionals. A fourth experiment shows that a prompt of "a [age] year old girl" generates sexualized images (as determined by an NSFW classifier) up to 73% of the time for VQGAN-CLIP (age 17), and up to 40% of the time for Stable Diffusion (ages 14 and 18); the corresponding rate for boys never surpasses 9%. The evidence indicates that language-vision AI models trained on automatically collected web scrapes learn biases of sexual objectification, which propagate to downstream applications.
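The effect sizes d reported for the embedding association tests can be read as standardized mean differences. The sketch below follows the commonly used WEAT-style formulation (difference of mean associations divided by the standard deviation over both target sets); whether the authors use exactly this variant, and with the sample rather than population standard deviation, is an assumption, and the tiny 2D vectors are purely hypothetical stand-ins for image and text embeddings.

```python
import math
import statistics

def cos(u, v):
    """Cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def association(w, A, B):
    """Mean similarity of w to attribute set A minus mean similarity to attribute set B."""
    return statistics.mean(cos(w, a) for a in A) - statistics.mean(cos(w, b) for b in B)

def effect_size(X, Y, A, B):
    """Standardized difference in association between target sets X and Y."""
    sx = [association(x, A, B) for x in X]
    sy = [association(y, A, B) for y in Y]
    # Assumed convention: sample standard deviation over the union of both target sets.
    return (statistics.mean(sx) - statistics.mean(sy)) / statistics.stdev(sx + sy)

# Hypothetical embeddings: X leans toward attribute A, Y leans toward attribute B.
X = [[1.0, 0.1], [1.0, 0.2]]
Y = [[0.1, 1.0], [0.2, 1.0]]
A = [[1.0, 0.0]]
B = [[0.0, 1.0]]
d = effect_size(X, Y, A, B)
```

Swapping the two target sets flips the sign of d, so the sign indicates which set is more strongly associated with the first attribute set.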
Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as in supply chains (inventory optimization), traffic, and in the transition towards carbon-free energy generation in battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively little research has been done in this area. This paper presents the findings of the "IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim of fostering and facilitating research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It then focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that leads to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.
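Sample average approximation, the technique named for the winning method, replaces the unknown expected cost with the average cost over a set of forecast scenarios and picks the decision that minimizes that average. The toy below is not the competition problem: the cost model (energy bought ahead at a low price, shortfalls covered at a higher price), the prices, the scenarios, and the candidate grid are all hypothetical and serve only to show the mechanics.

```python
def cost(quantity, demand, ahead_price=1.0, shortfall_price=3.0):
    """Cost of buying `quantity` ahead and covering any shortfall at a higher price."""
    return ahead_price * quantity + shortfall_price * max(demand - quantity, 0.0)

def saa_decision(candidates, scenarios):
    """Return (best_quantity, average_cost) minimizing mean cost over all scenarios."""
    def avg(q):
        return sum(cost(q, d) for d in scenarios) / len(scenarios)
    best = min(candidates, key=avg)
    return best, avg(best)

scenarios = [2.0, 4.0, 6.0, 8.0, 10.0]      # hypothetical demand forecasts
candidates = range(0, 11)                   # hypothetical decision grid
best_q, best_cost = saa_decision(candidates, scenarios)

# Baseline: plan for the mean scenario only, then evaluate across all scenarios.
mean_demand = sum(scenarios) / len(scenarios)
point_q = min(candidates, key=lambda q: cost(q, mean_demand))
point_cost = sum(cost(point_q, d) for d in scenarios) / len(scenarios)
```

In this toy the scenario-averaged decision buys more than the mean forecast suggests, because shortfalls are penalized more heavily than surplus; planning for the mean alone ignores that asymmetry.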
We apply the vision transformer, a deep machine learning model built around the attention mechanism, to mel-spectrogram representations of raw audio recordings. When adding mel-based data augmentation techniques and sample-weighting, we achieve comparable performance on both (PRS and CCS challenge) tasks of ComParE21, outperforming most single model baselines. We further introduce overlapping vertical patching and evaluate the influence of parameter configurations. Index Terms: audio classification, attention, mel-spectrogram, unbalanced data-sets, computational paralinguistics
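One plausible reading of overlapping vertical patching is sketched below: the spectrogram is cut into strips that span the full mel axis and slide along the time axis with a stride smaller than the strip width. The abstract does not define the patch geometry, so this interpretation, the function name, and the width and stride values are assumptions.

```python
def vertical_patches(spec, width, stride):
    """Split a spectrogram (list of mel rows, each a list of frames) into
    full-height strips of `width` frames, advancing `stride` frames per patch."""
    n_frames = len(spec[0])
    starts = range(0, n_frames - width + 1, stride)
    return [[row[s:s + width] for row in spec] for s in starts]

# Hypothetical 4-mel x 10-frame spectrogram; each value encodes (mel, frame).
spec = [[mel * 10 + frame for frame in range(10)] for mel in range(4)]
patches = vertical_patches(spec, width=4, stride=2)   # stride < width gives overlap
```

With a width of 4 and a stride of 2, consecutive strips share half of their frames, so each time step away from the edges appears in two patches.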
Common to all kinds of recurrent neural networks (RNNs) is the intention to model relations between data points through time. When there is no immediate relationship between subsequent data points (e.g., when the data points are generated at random), we show that RNNs are still able to remember a few data points back into the sequence by memorizing them by heart using standard backpropagation. However, we also show that for classical RNNs, LSTM and GRU networks the distance of data points between recurrent calls that can be reproduced this way is highly limited (compared to even a loose connection between data points) and subject to various constraints imposed by the type and size of the RNN in question. This implies the existence of a hard limit (way below the information-theoretic one) for the distance between related data points within which RNNs are still able to recognize said relation.